Most recent studies on neural constituency parsing focus on encoder structures, while few developments are devoted to decoders. Previous research has demonstrated that probabilistic statistical methods based on syntactic rules are particularly effective in constituency parsing, whereas syntactic rules are not used during the training of neural models in prior work probably due to their enormous computation requirements. In this paper, we first implement a fast CKY decoding procedure harnessing GPU acceleration, based on which we further derive a syntactic rule-based (rule-constrained) CKY decoding. In the experiments, our method obtains 95.89 and 92.52 F1 on the datasets of PTB and CTB respectively, which shows significant improvements compared with previous approaches. Besides, our parser achieves strong and competitive cross-domain performance in zero-shot settings.
translated by 谷歌翻译
Zero-shot cross-lingual named entity recognition (NER) aims at transferring knowledge from annotated and rich-resource data in source languages to unlabeled and lean-resource data in target languages. Existing mainstream methods based on the teacher-student distillation framework ignore the rich and complementary information lying in the intermediate layers of pre-trained language models, and domain-invariant information is easily lost during transfer. In this study, a mixture of short-channel distillers (MSD) method is proposed to fully interact the rich hierarchical information in the teacher model and to transfer knowledge to the student model sufficiently and efficiently. Concretely, a multi-channel distillation framework is designed for sufficient information transfer by aggregating multiple distillers as a mixture. Besides, an unsupervised method adopting parallel domain adaptation is proposed to shorten the channels between the teacher and student models to preserve domain-invariant features. Experiments on four datasets across nine languages demonstrate that the proposed method achieves new state-of-the-art performance on zero-shot cross-lingual NER and shows great generalization and compatibility across languages and fields.
translated by 谷歌翻译
Multilingual end-to-end models have shown great improvement over monolingual systems. With the development of pre-training methods on speech, self-supervised multilingual speech representation learning like XLSR has shown success in improving the performance of multilingual automatic speech recognition (ASR). However, similar to the supervised learning, multilingual pre-training may also suffer from language interference and further affect the application of multilingual system. In this paper, we introduce several techniques for improving self-supervised multilingual pre-training by leveraging auxiliary language information, including the language adversarial training, language embedding and language adaptive training during the pre-training stage. We conduct experiments on a multilingual ASR task consisting of 16 languages. Our experimental results demonstrate 14.3% relative gain over the standard XLSR model, and 19.8% relative gain over the no pre-training multilingual model.
translated by 谷歌翻译
The role of mobile cameras increased dramatically over the past few years, leading to more and more research in automatic image quality enhancement and RAW photo processing. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based image signal processing (ISP) pipeline replacing the standard mobile ISPs that can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale Fujifilm UltraISP dataset consisting of thousands of paired photos captured with a normal mobile camera sensor and a professional 102MP medium-format FujiFilm GFX100 camera. The runtime of the resulting models was evaluated on the Snapdragon's 8 Gen 1 GPU that provides excellent acceleration results for the majority of common deep learning ops. The proposed solutions are compatible with all recent mobile GPUs, being able to process Full HD photos in less than 20-50 milliseconds while achieving high fidelity results. A detailed description of all models developed in this challenge is provided in this paper.
translated by 谷歌翻译
In constituency parsing, span-based decoding is an important direction. However, for Chinese sentences, because of their linguistic characteristics, it is necessary to utilize other models to perform word segmentation first, which introduces a series of uncertainties and generally leads to errors in the computation of the constituency tree afterward. This work proposes a method for joint Chinese word segmentation and Span-based Constituency Parsing by adding extra labels to individual Chinese characters on the parse trees. Through experiments, the proposed algorithm outperforms the recent models for joint segmentation and constituency parsing on CTB 5.1.
translated by 谷歌翻译
激活函数是元素的数学函数,在深神经网络(DNN)中起着至关重要的作用。已经提出了许多新颖和复杂的激活功能来提高DNN的准确性,但在训练过程中还可以通过反向传播消耗大量记忆。在这项研究中,我们提出了嵌套的正向自动分化(正向AD),专门针对用于记忆效率的DNN训练的元素激活函数。我们在两个广泛使用的深度学习框架(Tensorflow和Pytorch)中部署了嵌套的AD,分别支持静态和动态计算图。我们的评估表明,在相同的记忆降低率下,嵌套的前AD嵌套将记忆足迹降低到1.97倍,比基线模型降低了20%。
translated by 谷歌翻译
量化是一种降低DNN模型的计算和记忆成本的技术,DNN模型越来越大。现有的量化解决方案使用固定点整数或浮点类类型,这些量子的好处有限,因为两者都需要更多位以保持原始型号的准确性。另一方面,可变长度量化使用低位量化对正常值和高精度的分数对异常值的一部分。即使这项工作带来了算法的好处,但由于长度的编码和解码,它也引入了重要的硬件开销。在这项工作中,我们提出了一种称为ANT的固定长度自适应数值数据类型,以通过微小的硬件开销实现低位量化。我们的数据类型ANT利用了两项关键创新来利用DNN模型中的张贴内和调整的自适应机会。首先,我们提出了一种特定的数据类型Flint,该数据类型结合了Float和INT的优势,以适应张量中不同值的重要性。其次,我们提出了一个自适应框架,该框架根据其分布特性选择每个张量的最佳类型。我们为蚂蚁设计了统一的处理元件体系结构,并显示其与现有DNN加速器的易于集成。我们的设计导致2.8 $ \ times $速度和2.5 $ \ times $ $ $ $ $ \ times $ $ \ times $ $ \ times $ $ \ times $ $ \ times $ $ \ times $ $ \ times $ $ \ times $比最先进的量化加速器提高了能源效率。
translated by 谷歌翻译
本文回顾了AIM 2022上压缩图像和视频超级分辨率的挑战。这项挑战包括两条曲目。轨道1的目标是压缩图像的超分辨率,轨迹〜2靶向压缩视频的超分辨率。在轨道1中,我们使用流行的数据集DIV2K作为培训,验证和测试集。在轨道2中,我们提出了LDV 3.0数据集,其中包含365个视频,包括LDV 2.0数据集(335个视频)和30个其他视频。在这一挑战中,有12支球队和2支球队分别提交了赛道1和赛道2的最终结果。所提出的方法和解决方案衡量了压缩图像和视频上超分辨率的最先进。提出的LDV 3.0数据集可在https://github.com/renyang-home/ldv_dataset上找到。此挑战的首页是在https://github.com/renyang-home/aim22_compresssr。
translated by 谷歌翻译
最近,后门攻击已成为对深神经网络(DNN)模型安全性的新兴威胁。迄今为止,大多数现有研究都集中于对未压缩模型的后门攻击。尽管在实际应用中广泛使用的压缩DNN的脆弱性尚未得到利用。在本文中,我们建议研究和发展针对紧凑型DNN模型(RIBAC)的强大和不可感知的后门攻击。通过对重要设计旋钮进行系统分析和探索,我们提出了一个框架,该框架可以有效地学习适当的触发模式,模型参数和修剪口罩。从而同时达到高触发隐形性,高攻击成功率和高模型效率。跨不同数据集的广泛评估,包括针对最先进的防御机制的测试,证明了RIBAC的高鲁棒性,隐身性和模型效率。代码可从https://github.com/huyvnphan/eccv2022-ribac获得
translated by 谷歌翻译
高动态范围(HDR)成像是图像处理中的一个基本问题,即使在场景中存在不同的照明的情况下,它旨在产生暴露良好的图像。近年来,多曝光融合方法已取得了显着的结果,该方法合并了多个具有不同暴露的动态范围(LDR)图像,以生成相应的HDR图像。但是,在动态场景中综合HDR图像仍然具有挑战性,并且需求量很高。生产HDR图像有两个挑战:1)。 LDR图像之间的对象运动很容易在生成的结果中引起不良的幽灵伪像。 2)。由于在合并阶段对这些区域的补偿不足,因此下区域和过度曝光的区域通常包含扭曲的图像含量。在本文中,我们提出了一个多尺度采样和聚合网络,用于在动态场景中进行HDR成像。为了有效地减轻小动作和大型动作引起的问题,我们的方法通过以粗到精细的方式对LDR图像进行了暗中对齐LDR图像。此外,我们提出了一个基于离散小波转换的密集连接的网络,以改善性能,该网络将输入分解为几个非重叠频率子带,并在小波域中自适应地执行补偿。实验表明,与其他有希望的HDR成像方法相比,我们提出的方法可以在不同场景下实现最新的性能。此外,由我们的方法生成的HDR图像包含清洁剂和更详细的内容,扭曲较少,从而带来更好的视觉质量。
translated by 谷歌翻译